Semi-Supervised Learning of Named Entity Substructure

Authors

  • Alden Timme
  • Richard Socher
Abstract

The goal of this project was two-fold: (1) to provide an algorithm that correctly finds and labels named entities in text, and (2) to uncover substructure within the named entities (such as the distinction between first and last names among person entities). The underlying algorithm is a Class Hidden Markov Model (CHMM), a Hidden Markov Model whose hidden states emit observed classes as well as observed words. The model is further strengthened by incorporating features: the multinomial probability distributions for transitions and emissions are replaced with the outputs of logistic regressions over those features.
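To make the CHMM idea concrete, the following is a minimal sketch of the forward computation for a toy model in which each hidden state emits both a word and a class. It is not the authors' implementation: the function name `chmm_forward`, the three hidden states (first name, last name, other), the class labels `PER` and `O`, and every probability value are assumptions chosen purely for illustration, and the plain multinomial tables stand in for the feature-based logistic regressions the abstract mentions.

```python
def chmm_forward(words, classes, init, trans, emit_word, emit_class):
    """Return P(words, classes) under a toy class HMM (illustrative only)."""
    n_states = len(init)

    def emit(s, w, c):
        # Each hidden state emits a word AND a class, independently given the state.
        return emit_word[s].get(w, 0.0) * emit_class[s].get(c, 0.0)

    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [init[s] * emit(s, words[0], classes[0]) for s in range(n_states)]
    for w, c in zip(words[1:], classes[1:]):
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit(s, w, c)
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical hidden states: 0 = first name, 1 = last name, 2 = other.
init = [0.3, 0.1, 0.6]
trans = [
    [0.1, 0.8, 0.1],  # first name is usually followed by a last name
    [0.1, 0.1, 0.8],  # last name is usually followed by a non-entity word
    [0.3, 0.1, 0.6],
]
emit_word = [
    {"John": 0.9, "Smith": 0.1},
    {"John": 0.1, "Smith": 0.9},
    {"said": 1.0},
]
# Both name states emit the observed class PER; this is the hidden substructure.
emit_class = [{"PER": 1.0}, {"PER": 1.0}, {"O": 1.0}]

p = chmm_forward(["John", "Smith", "said"], ["PER", "PER", "O"],
                 init, trans, emit_word, emit_class)
print(p)
```

Because the two name states share the observed class `PER`, the class labels constrain which states are possible at each position, while the split between first and last names remains hidden and has to be inferred. In a feature-based variant, the table lookups in `emit` and `trans` would be replaced by probabilities produced by logistic regressions.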

Similar resources

A Semi-supervised Learning Approach to Arabic Named Entity Recognition

We present ASemiNER, a semi-supervised algorithm for identifying Named Entities (NEs) in Arabic text. ASemiNER does not require annotated training data or gazetteers. It can also be easily adapted to handle more than the three standard NE types (Person, Location, and Organisation). To our knowledge, our algorithm is the first study that intensively investigates the semi-supervised pattern-based...


Data Analysis Project: Semi-Supervised Discovery of Named Entities and Relations from the Web

This project studies semi-supervised discovery of named entities, relational entities and prepositional phrase attachments within a read-the-web framework. Meanings of an entity can be improvised and updated faster in the internet world than printed references. The main idea of this project is to study the feasibility of characterizing entities by web content directly. The approach is that cont...


Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model

Named entity recognition (NER) plays an important role in the NLP literature. Traditional methods tend to rely on large annotated corpora to achieve high performance. Unlike many semi-supervised learning models for the NER task, in this paper we employ the graph-based semi-supervised learning (GBSSL) method to utilize freely available unlabeled data. The experiment shows that the u...


Minimally-supervised methods for Arabic Named Entity Recognition

Supervised methods can achieve high performance on NLP tasks, such as Named Entity Recognition (NER), but new annotations are required for every new domain and/or genre change. This has motivated research in minimally supervised methods such as semi-supervised learning and distant learning, but neither technique has yet achieved performance levels comparable to those of supervised methods. Semi-...


A Simple Semi-supervised Algorithm For Named Entity Recognition

We present a simple semi-supervised learning algorithm for named entity recognition (NER) using conditional random fields (CRFs). The algorithm is based on exploiting evidence that is independent from the features used for a classifier, which provides high-precision labels to unlabeled data. Such independent evidence is used to automatically extract high-accuracy and non-redundant data, leading ...




Publication date: 2010